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Abstract 

Surveillance  data  from  an  oncology  hospital  unit  on  Vancomycin-resistant  En¬ 
terococcus  (VRE),  one  of  the  most  prevalent  and  dangerous  pathogens  involved  in 
hospital  infections,  is  used  to  motivate  possibilities  of  modeling  nosocomial  infec¬ 
tion  dynamics.  This  is  done  in  the  context  of  hospital  monitoring  and  isolation 
procedures  as  a  prelude  to  the  evaluation  and  improved  design  of  control  measures. 
A  discrete  event  delay  differential  equation  model  in  conjunction  with  statistical 
computational  methods  is  formulated  to  estimate  key  population-level  nosocomial 
transmission  parameters  and  isolation  procedures.  This  framework  is  used  to  test 
the  surveillance  data’s  usefulness  in  model  validation.  In  the  process  of  model  cal¬ 
ibration  we  discovered  significant  irregularities  in  the  available  surveillance  data; 
these  irregularities  are  most  likely  the  result  of  the  data  observational  recording- 
process  as  well  as  those  in  the  isolation  procedures.  Efforts  to  fit  data  within  our 
highly  flexible  dynamic-modeling  framework  suggest  that  clinical-trial  level  surveil¬ 
lance  data  is  needed  if  one  is  to  successfully  develop  quantitative  models  for  disease 
transmission  and  intervention.  It  is  concluded  that  typical  “cold”  data  sets  typ¬ 
ically  encountered  in  biological/sociological  quantitative  modeling  efforts  may  be 
inadequate  for  support  of  serious  model  development. 


Key  Words:  Delay  equations,  discrete  events,  nosocomial  infection  dynamics,  surveil¬ 
lance  data,  inverse  problems,  parameter  estimation. 
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1  Introduction 


One  of  the  more  difficult  endeavors  in  inverse  problems  is  the  development  and  eval¬ 
uation  (validation)  of  mathematical  and  statistical  models  with  so-called  “cold  data”. 
This  data,  often  routinely  available  in  sociological  and  biological  investigations,  which 
are  quantitative  observations  (often  mixed  with  anecdotal  information),  are  typically 
collected  with  no  formal  quantitative  modeling  anticipated.  Rather,  the  collectors  have 
in  mind  some  vague  idea  of  recording  their  observations  in  hope  that  this  data  will  some¬ 
how  be  useful  in  understanding  (often  at  the  dynamical  or  longitudinal  level)  the  process 
(biological,  physical,  sociological,  etc.)  of  interest  to  them.  An  important  contribution 
that  inverse  problem  investigators  can  make  is  to  understand  when  a  data  set  is  or  is 
not  suitable  for  model  validation.  Numerous  quantitative  modeling  tools  (mathematical 
and  statistical)  are  available  to  aid  in  this  effort;  a  brief  review  of  some  of  these  can  be 
found  in  [9,  11,  12,  13]-see  also  [16].  In  this  paper  we  present  a  detailed  example  of  a 
modeling  attempt  based  on  “data”  from  an  isolation  process  at  hospitals  in  the  US.  In 
[31]  we  used  hospital  based  Vancomycin-resistant  Enterococcus  (VRE)  disease  data  to 
motivate  development  of  the  most  “natural”  dynamics  model.  This  entailed  a  stochas¬ 
tic  Markov  process  model  for  transition  between  compartments  consisting  of  healthy 
(uncolonized),  unhealthy  (colonized)  patients  in  the  hospital  population  and  colonized 
patients  in  isolation.  The  use  of  data  with  such  small  population  stochastic  models  in 
parameter  estimation  efforts  is  difficult.  In  [31]  we  presented  a  methodology  for  using 
the  corresponding  limit  (in  large  population)  mean  (sample  path  averages)  population 
dynamic  model  containing  the  transition  parameters  to  estimate  the  transition  proba¬ 
bilities  in  the  stochastic  models.  The  emphasis  in  [31]  was  on  methodology  and  little 
attention  was  given  to  efforts  to  further  develop  the  models  to  obtain  good  fit  to  the 
experimental  data.  Here  we  focus  on  careful  efforts  to  develop  longitudinal  models  to 
describe  the  data  as  understood  by  hospital  workers.  In  particular  we  develop  models 
in  attempts  to  include  precise  events  as  described  in  the  data  and  reported  by  hospital 
workers  in  the  data  acquisition  process.  In  these  efforts  we  attempt  to  model  the  disease 
progression  (the  mathematical  model)  along  with  the  data  collection  procedures  (the  sta¬ 
tistical  model)  as  faithfully  as  possible  in  an  effort  to  decide  if  the  “cold  data”  is  useful 
in  modeling  attempts  or  is  use  of  the  data  with  precisely  formulated  mathematical  and 
statistical  models  somehow  fraught  with  fundamental  difficulties,  i.e.,  is  it  an  “ill-posed 
problem”  in  and  of  itself?  For  example  is  the  data  collection  itself  flawed,  resulting  in 
such  large  variability  that  it  is  essentially  unsuited  for  such  modeling?  If  so,  one  should 
use  experimental  optimal  design  criteria  and  methodology  [3,  11,  12,  17,  25,  26]  to  design 
specific  experiments  (i.e.,  data  collection  protocols)  in  support  of  the  modeling  process. 
Successful  examples  of  such  endeavors  are  given  in  [15,  Chapter  V,  Sec.  6,  7]  and  more 
recently  in  [10]. 

We  begin  with  the  ordinary  differential  equation  system  for  sample  path  averages 
suggested  by  the  stochastic  model  of  [31]  and  modify  the  model  to  include  jump  dis¬ 
continuities  and  time  delays  (both  scheduled)  that  are  reported  in  detailed  accounting 
of  the  hospital  and  data  acquisition  procedures.  We  then  show  how  to  reformulate  the 
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corresponding  delay  system  as  an  abstract  evolution  system  in  a  hereditary  space  so  that 
standard  finite  element  type  approximations  are  applicable.  We  then  use  known  math¬ 
ematical  and  statistical  methodology  for  inverse  problems  to  investigate  the  resulting 
problem  in  the  context  of  the  available  data.  We  first  give  a  summary  of  the  pertinent 
facts  underlying  VRE  infections  and  transmission  in  hospital  settings. 

2  VRE  Background 

Nosocomial,  or  hospital-acquired  infections,  the  fourth  cause  of  death  [1]  in  the  US,  are 
evidence  that  hospitals  provide  not  only  medical  care  but  also  harbor  pathogens  that 
pose  serious,  often  fatal,  risks  of  infection,  particularly  to  the  young,  the  elderly,  and 
immune-compromised  individuals.  Infection-control  measures  aimed  at  reducing  their 
impact  are  being  implemented  with  various  degrees  of  efficiency  at  US  hospitals. 

Hospitals  and  health  care  professionals  are  committed  to  improve  the  health  of  their 
patients.  However,  there  are  risks  associated  with  the  provision  of  health  care  with  one 
of  the  most  important  being  the  acquisition  of  infections  at  hospitals.  The  Centers  for 
Disease  Control  and  Prevention  (CDC)  estimates  that  5%  to  10%  of  patients,  or  more 
than  two  million  patients  each  year  will  get  an  infection  while  in  a  United  States  hospital 
with  about  90,000  of  them  dying  from  such  infections  [21].  These  hospitals-acquired 
infections  or  nosocomial  infections  are  infections  not  present  or  incubating  in  a  patient 
at  the  time  of  admission  to  a  hospital  or  health  care  facility.  Three  decades  ago  infection- 
control  measures  were  put  in  place  to  control  antibiotic-resistant  nosocomial  infections 
and  yet  these  infections  have  continued  to  increase.  Multidrug-resistant  pathogens  have 
become  increasingly  problematic,  especially  in  the  critical  care  setting. 

Most  of  the  nosocomial  infections  are  primarily  caused  by  antibiotic  resistant  pathogens, 
such  as  Vancomycin-resistant  Enterococcus  (VRE).  VRE  is  the  group  of  bacterial  species 
[18]  of  the  genus  enterococcus  that  is  resistant  to  the  antibiotic  vancomycin  and  it  can 
be  found  in  the  digestive/gastrointestinal,  urinary-tracts,  surgical-incision,  and  blood¬ 
stream  sites. 

The  duration  of  colonization  could  last  from  weeks  to  months  [20].  The  factors 
most  associated  in  predisposing  VRE  colonization  to  patients  includes:  a  compromised 
immune  system  or  nutritional  status,  the  use  of  catheters  (such  as  urinary  or  central 
venous),  co-morbidities  (e.g.,  diabetes,  renal  insufficiency,  cancer),  length  of  stay  in  the 
hospital,  inadequate  infection  control  practice  among  health  care  workers  (HCW),  and 
prolonged  antibiotic  used  (>  10  days).  Hence  VRE  patients  admitted  in  hospital  units 
such  as  intensive  care  and  oncology  have  a  greater  colonization  risk. 

Transmission  of  VRE  can  occur  through  contact  with  colonized  or  infected  individ¬ 
uals  (although,  there  are  cases  in  which  VRE  acquisition  may  arise  from  the  patient’s 
own  gut  flora).  The  most  frequent  form  of  transmission  is  by  contact,  categorized  as 
direct-contact  transmission  or  indirect-contact  transmission.  Direct-contact  transmis¬ 
sion  involves  direct  physical  contact  (mostly  hands)  between  a  susceptible  host  and  a 
colonized  agent.  Indirect-contact  transmission  involves  contact  between  a  susceptible 
host  and  a  contaminated  institutional  environment,  that  includes  health  care  workers 
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(human  vectors). 

The  CDC  Hospital  Infection  Control  Program  [22]  encourages  hospitals  to  develop 
their  own  institution-specific  interventions  plans  that  should  stress:  prudent  vancomycin 
use  by  clinicians,  hospital  staff  education  regarding  vancomycin  resistance,  early  detec¬ 
tion  and  prompt  reporting  of  vancomycin  resistance  in  enterococci  by  the  hospital  mi¬ 
crobiology  laboratory,  and  immediate  implementation  of  appropriate  infection  control 
measures  (such  as  isolation)  to  prevent  person-to-person  VRE  transmission.  Isolation 
procedures  consist  mostly  of  frequent  hand  washing  which  is  considered  the  single  most 
important  control  measure. 

Deterministic  and  stochastic  models  have  made  substantial  contributions  to  our  un¬ 
derstanding  of  the  epidemiological  dynamic  of  infections  [2,  19,  27].  Hence,  they  have 
been  valuable  tools  to  predict  and  explain  the  epidemiology  of  nosocomial  infections. 
Many  of  the  models  developed  to  describe  the  transmission  of  nosocomial  infection  in  a 
health  care  setting  have  been  based  on  the  Ross-Macdonald  model  [32]  where  the  trans¬ 
mission  of  pathogens  in  health  care  settings  considers  health  care  workers  as  vectors  and 
patients  as  hosts  [4,  23,  29,  30,  34],  These  models  have  been  used  in  attempts  to  explain 
the  spread  of  infections,  specifically  by  investigations  of  the  impact  of  infection  control 
measures  such  as  patient  isolation,  hand-washing,  and  bacterial-control  among  others. 

In  this  paper  we  discuss  mathematical  models  for  the  transmission  of  VRE  in  a  hos¬ 
pital  unit.  The  development  of  these  models  is  based  on  the  epidemiological  knowledge 
of  VRE  in  a  setting  that  allows  for  the  implementation  of  infection  control  measures  in 
hospitals.  We  focus  on  the  connection  of  these  models  with  unpublished  VRE  surveil¬ 
lance  data  from  an  oncology  unit  in  order  to  estimate  some  of  the  parameters  that  govern 
the  underlying  transmission  infection  dynamics.  The  usefulness  of  our  new  flexible  mod¬ 
eling  framework  for  the  transmission  dynamics  of  nosocomial  infections  like  VRE  was 
first  evaluated  using  synthetic  noisy  data  and  then  tested  against  the  available  data. 
Our  conclusions  for  this  particular  data  set  are  then  detailed. 

3  Surveillance  data 

The  motivating  surveillance  data  is  from  an  oncology  unit,  obtained  from  the  VRE  In¬ 
fection  Control  database  of  the  Department  of  Quality  Improvement  Support  Service 
of  Yale-New  Haven  Hospital.  Data  reports  on  the  number  of  VRE  cases  occurred  on 
admission  (including  patients  transferred),  the  patients’  length  of  stay,  the  daily  number 
of  patients  in  isolation  due  to  VRE  colonization,  the  compliance  of  swab  culture  admin¬ 
istered  on  admission,  and  the  health  care  worker  contacts  precautions  compliance.  Data 
collection  occurred  during  the  period  of  January  2005  to  January  2007  with  a  mean 
number  of  31  in-patients  per  day  (with  a  total  of  37  beds). 

Ward  protocol  required  rectal  swabbing  of  all  patients  on  admission,  and  once  a  week 
thereafter  (every  Tuesday)  for  VRE  surveillance.  Compliance  was  not  100%,  as  the  mean 
percentage  of  swab  cultures  taken  on  admitted  patients  per  day  was  77%.  Swab-test 
results  usually  were  returned  48  hours  after  admission.  If  a  patient  tested  VRE  positive, 
he/she  was  isolated.  The  isolation  procedure  consisted  of  contact  precautions  by  the 
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use  of  gowns,  gloves,  and  the  location  of  a  patient  to  a  single  room  or  to  a  room  with 
another  patient  who  was  also  VRE  positive.  If  a  readmission  patient  had  a  positive 
VRE  culture  in  the  past,  he/she  did  not  receive  the  rectal  swab  on  admission  and  was 
isolated  immediately.  The  isolation  report  was  performed  on  weekdays  (no  weekends 
or  holidays).  The  mean  number  of  isolated  VRE  colonized  patients  per  day  was  9.39 
(std=2.90)  patients. 

4  VRE  epidemic  model 

We  consider  a  deterministic  continuous  time  compartmental  model  in  which  patients  in  a 
hospital  unit  are  classified  as  either  uncolonized  U  (t) ,  VRE  colonized  through  admission 
C\  ( t ) ,  VRE  colonized  during  hospital  stay  C2  ( t ) ,  and  VRE  colonized  patients  in  isolation 
J(t),  as  depicted  in  the  compartmental  schematic  of  Figure  1.  (A  related  Markov  Chain 
stochastic  model  is  presented  in  [31]). 

We  assume  homogenous  mixing,  that  is,  each  patient  is  considered  equally  likely 
to  be  in  contact  with  a  health  care  worker  in  any  time  interval,  equally  likely  to  be 
VRE  colonized,  and,  if  VRE  colonized  at  a  given  time,  equally  likely  to  transmit  the 
pathogen  at  a  given  time.  Patients  are  admitted  to  the  hospital  unit  at  a  rate  A  per 
day  and  some  fraction  m  are  already  VRE  colonized.  Therefore,  VRE  colonized  patients 
through  admission  (C\)  enter  the  hospital  at  a  rate  raA  and  uncolonized  patients  enter 
the  hospital  at  a  rate  (1  — m)A.  VRE  colonized  patients  (Ci,  C2,  and  J)  are  discharged  at 
a  different  rate  from  uncolonized  patients.  Uncolonized  patients  become  VRE  colonized 
at  a  rate  proportional  to  the  prevalence  of  patients  carrying  the  bacteria.  It  is  assumed 
that  an  average  patient  in  the  population  makes  f3N  effective  contacts  (i.e.,  contact 
sufficient  to  lead  to  VRE  colonization)  with  other  patients  per  unit  time  through  health 
care  workers,  where  N  is  the  total  population  size.  Since  the  probability  is  U/N  that  a 
random  contact  by  a  VRE  colonized  patient  is  with  an  uncolonized  patient,  the  number 
of  new  colonization  in  unit  time  per  infective  is  (/ 3N)(JJ/N ).  This  yields  a  rate  of  new 
VRE  colonization  (f3N) (U/N)  [C{  +  C2  +  (1  - 7)  J]  =  0U[Ci  +  C2  +  (1  —  7)  J}.  The  hand- 
hygiene  policy  applied  to  health  care  workers  on  VRE  colonized  patients  in  isolation 
reduces  infectivity  by  a  factor  of  7  (0  <  7  <  1).  This  assumption  means  that  isolated 
VRE  colonized  patients  make  fewer  contacts  than  regular  patients,  so  transmission  of 
the  bacteria  by  these  isolated  members  has  an  infectivity  factor  (1  —  7).  It  is  assumed 
VRE  colonization  periods  last  from  weeks  to  months.  In  addition,  because  spontaneous 
decolonization  occurs  infrequently,  clearance  of  the  bacteria  is  not  considered  in  the 
model.  VRE  colonized  patients  are  not  treated  for  VRE  and  all  patients  on  admission  are 
swab  tested  for  VRE.  A  waiting  time  of  two  days  for  the  results  of  the  swab-test  cultures 
is  assumed  for  all  patients  in  any  time  period.  After  results  are  returned,  VRE  colonized 
patients  are  moved  into  isolation  (see  the  next  section  for  the  derivation  of  isolation 
rates).  As  a  simplification,  we  assume  that  the  total  number  of  patients  remains  fixed 
(i.e.,  overall  admission  rate  equals  overall  discharge  rate,  A  =  ^l\U  +  n2(C\  +  C2  +  J)), 
and  VRE  colonization  confers  no  additional  mortality.  Finally,  the  total  population  of 
patients  can  be  written  as  N  =  U  +  C\  +  C2  +  J. 
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Figure  1:  VRE  compartmental  model 


A  description  of  the  variables  and  parameters  used  in  our  model  are  given  in  Table 
1,  and  the  model  equations  (which  are  derived  in  the  next  section)  are  the  given  by 


dCi(t) 

dt 


dC2(t ) 
dt 

C2(tf+1) 
dJ(t ) 
dt 

A4+ 1) 


m{mN  +  {n2  -  iii)[C\{t)  +  C2(t)  +  J(i)]} 

me-2^{^N  +  (/i2  -  mi )[Ci(t  -  2)  +  C2(t  -  2)  +  J(f  -  2)]} 

M2C1W 

/5{iV  -  [Ci(t)  +  C2(t)  +  J(t)]}[Ci(t)  +  C2(t)  +  (1  -  j)J(t)} 

M2  C2(t),  for  ti  <t<  ti+i 

C2(tr+1)  -  C2{ti+ 1  -  2)e-2^ 

me~2^{niN  +  (m2  -  Mi)[Ci(t  -  2)  +  C2(t  -  2)  +  J(t  -  2)]} 

M2 </(£),  for  U  <  t  <  tl+\ 

J{tr+1)  +  C2{ti+ r-2)e-2^  (1) 


with  initial  conditions:  Ci(0)  =  Coi,  62(0)  =  Cq2,  J(0)  =  Jo,  and  a  trajectory  of  the 
solution  in  the  past:  Cj  (0)  =  T(0),  C2(0)  =  d^#),  J(0)  =  Q(6)  for  9  €  [—2,0). 
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Table  1:  Description  and  units  for  model  parameters  and  varaibles 


Variables 

Description 

Units 

u(t) 

Number  of  uncolonized  patients 

Individuals 

Ci{t) 

Number  of  VRE  colonized  patients 
through  admission 

Individuals 

c2(t) 

Number  of  VRE  colonized  patients 
during  hospital  stay 

Individuals 

J(t) 

Number  of  VRE  colonized  patients 
in  isolation 

Individuals 

Parameters 

Description 

Units 

A 

Patients  admission  rate 

Individuals  /  day 

m 

VRE  colonized  patients  on  admission  rate 

Dimensionless 

P 

Effective  contact  rate 

1/  day 

7 

HCW  hand  hygiene  compliance  rate 

Dimensionless 

Ai 

Uncolonized  patients  discharged  rate 

1/  day 

A2 

VRE  colonized  patients  discharged  rate 

1/  day 

4.1  Brief  derivation  of  the  model 

The  rate  of  change  in  the  total  population  of  C\  ’s  can  be  modeled  by  considering  the 
rate  of  admission  to  this  compartment  minus  the  rate  at  which  C\  ’s  are  isolated  and 
minus  the  rate  at  which  Ci’s  are  discharged  before  isolation.  It  is  assumed  that  swab 
cultures  are  administered  to  every  patient  that  is  admitted  (100%  swab  compliance)  and 
that  it  takes  two  days  for  the  test  results  to  be  returned.  Therefore,  two  days  after  a 
colonized  patient  has  been  admitted,  he/she  will  be  isolated  if  he/she  is  not  discharged 
over  the  waiting  period  of  two  days.  Thus,  the  isolation  rate  of  VRE  colonized  patients 
tested  on  admission  depends  on  the  past  history  of  these  patients.  Since  mA{t)  is  the 
admission  rate  of  VRE  colonized  patients,  then  mA(t  —  2)  is  the  rate  at  which  the  Ci’s 
were  admitted  at  time  t  —  2.  Some  of  those  admitted  at  time  t  —  2  may  be  discharged  at 
a  rate  //2  before  being  isolated  over  the  period  of  two  days.  Thus,  we  need  to  find  the 
fraction  of  those  admitted  at  time  t  —  2  that  are  still  admitted  two  days  after. 

We  consider  the  “cohort”  of  patients  who  were  all  admitted  at  t  =  0,  denoted  by 
C\  (0).  Let  Ci  (2)  denote  the  number  of  these  who  are  still  in  Ci  class  2  days  later.  If 
patients  leave  Ci  class  at  the  rate  /x 2  per  day,  then 

^^P  =  -M2C,(t)  (2) 
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with  initial  condition  Ci(0)  =  Cq\.  Hence 


=  e- -2*. 

cm 


(3) 


denotes  the  fraction  of  individuals  who  were  admitted  at  time  t  =  0  and  who  are  still 
admitted  at  time  t  =  2.  Thus,  the  rate  of  isolation  of  CVs  at  time  t  by  factoring  in 
discharges  over  the  period  of  two  days  is  mA(t  —  2)e-2/i2.  Then,  the  dynamics  of  CVs 
at  time  t  is  given  by 


=  mA(t)  —  mA(t  —  2)e  2M2  —  ^2Ci{t) 


(4) 


with  initial  condition  C\  (0)  =  C'o i .  Consequently,  the  rate  of  change  of  isolated  VRE 
colonized  patients  at  time  t  can  be  modeled  as 


CIIM1  =  mA(t  —  2)e  2/i2  —  H2 J{t) 


(5) 


which  is  the  rate  of  Ci’s  entering  to  J  class  minus  the  discharge  rate  in  unit  time  and 
with  initial  condition  J(0)  =  Jq. 

The  rate  of  change  in  the  total  population  of  CVs  can  be  modeled  by  considering 
the  rate  of  new  colonizations  during  hospital  stay  minus  the  rate  at  which  CVs  are 
isolated  and  minus  the  rate  at  which  CVs  are  discharged  before  isolation.  Weekly  swab 
cultures  were  administered  every  Tuesday.  Assuming  that  every  patient  is  tested  (100% 
compliance)  with  two  days  for  the  test  results  to  be  returned  (sometime  before  health 
care  worker’s  night  shift),  we  can  assume  that  VRE  colonized  patients  in  C2  class  will 
be  moved  into  isolation  every  Thursday  night.  Hence,  there  is  not  a  delay  involved  here 
but  rather  a  jump  discontinuity  every  Thursday. 

We  let  ti  be  a  Thursday  and  tl+\  be  the  next  Thursday.  We  can  then  expect  a  jump 
discontinuity  in  the  number  of  patients  in  C2  class  as  well  as  in  the  number  of  patients  in 
J  class  at  time  C+i-  If  C2{tf+1 )  represents  the  number  of  patients  in  C2  after  isolation, 
then 


C2(tf+1)  =  C2(t-+1 )  -  C2(ti+1  -  2)e-2^2.  (6) 

Here  C2{ti+\  —  2)  represents  the  number  of  patients  in  C2  class  that  were  tested  on 
Tuesday  and  e~2f12  is  the  fraction  of  those  C2(U+ 1  —  2)  that  were  not  discharged  over  the 
period  of  two  days  before  isolation  (it  follows  from  the  same  derivations  as  represented 
in  (3)).  This  jump  discontinuity  in  C2  influences  a  jump  in  J  defined  as 

J(tf+ 1)  -  J(t~+1)  =  C2{ti+i  -  2)e~2^.  (7) 

The  number  of  isolated  patients  after  isolation,  J(tf+1),  can  be  determined  by  the  number 
of  patients  in  J  class  before  isolation  plus  the  number  of  patients  in  C2  class  that  were 
isolated  at  time  Therefore,  the  dynamics  for  compartment  C2  and  J  at  time  t  for 


(3U(t)  [Cxit)  +  C2(t)  +  (1  -  7).J(t)]  -  fi2C2(t)  (8) 

mA(t  —  2)e-2/i2  —  /. i2J(t ),  (9) 

with  initial  condition  C2( 0)  =  Cq2,  J( 0)  =  Jo-  The  rate  of  change  for  C2  class  is  modeled 
by  the  rate  of  new  colonizations  minus  the  rate  of  discharged  in  unit  time.  The  rate  of 
change  of  J  class  is  modeled  by  the  rate  of  Ci’s  entering  J  minus  the  discharge  rate  in 
unit  time. 

Finally,  we  assume  the  overall  admission  rate  equal  to  the  overall  discharge  rate,  i.e., 
A (t)  =  ii\U(t)  +  n2[C\{t)  +  C2(t)  +  J(t)].  In  order  to  keep  the  overall  rate  of  admission 
equal  to  the  overall  rate  of  discharge,  we  replace  U{t)  =  N  —  [ C\{t )  +  C2(t)  +  J(t )] 
obtaining  the  system  of  equations  given  in  (1). 

4.2  Parameters  estimated  directly  form  the  surveillance  data 

Infection  control  measures  were  implemented  in  the  form  of  health  care  worker  hand- 
hygiene  before  and  after  patients  contact  by  the  use  of  gloves  and  gowns,  and  washing 
of  hands.  For  this  study  we  consider  the  health  care  worker  before  patient  contact 
compliance  of  57.56%  as  a  better  estimator  for  the  parameter  7.  In  the  oncology  unit 
VRE  colonized  patients  had  a  mean  length  of  stay  of  13.15  days  (std=18.28)  compared 
with  6.27  (std=6.80)  for  the  uncolonized  patients.  These  means  are  statistically  signif¬ 
icantly  different  (p-value  <  0.0001),  supporting  the  assumption  of  different  discharge 
rates.  Hence,  we  take  1/ni  =  6.27  and  I/V2  =  13.15  giving  /xi  =  0.16  and  /x 2  =  0.08. 

In  an  attempt  to  estimate  the  fraction  m  of  patients  that  are  colonized  on  admission, 
we  found  inconsistencies  in  the  reported  prevalence  of  VRE  on  admission  (the  summaries 
of  admitted  patients  did  not  match  the  actual  data).  In  estimating  the  initial  conditions 
(C01,  C02,  Jo)  from  the  data  reported  on  the  first  day  of  data  collection  (January  3,  2005), 
only  the  number  of  VRE  colonized  patients  in  isolation  is  reported.  Hence,  the  initial 
conditions  for  C\  and  C2  classes  cannot  be  easily  estimated.  Another  parameter  that  is 
of  interest  and  can  not  be  estimated  directly  from  the  data  is  the  VRE  transmission  rate 
f3.  As  a  result,  the  fraction  of  patients  that  are  colonized  on  admission  and  the  trans¬ 
mission  rate  are  estimated  using  inverse  problem  methodology.  In  Table  2  we  present 
the  estimated  and  assumed  values  of  the  parameters  and  initial  conditions  required  for 
inverse  problem  calculations. 

5  Numerical  implementation 

The  solutions  to  the  system  (1)  can  be  simulated  using  an  algorithm  developed  by 
Banks  and  Kappel  [14]  and  extended  for  nonlinear  systems  [5,  6,  28]  (see  also  [7,  8]  for 
applications  of  this  algorithm).  The  idea  behind  the  algorithm  is  to  first  represent  the 
system  as  an  infinite  dimensional  abstract  evolution  equation  (AEE)  and  then  consider 


ti  <  t  <  ti- |_i  are  given  by 

dC2(t) 

dt 

dJ(t) 

dt 
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Table  2:  Parameters  and  initial  conditions  values  (values  assumed  are  used  for  optimiza¬ 
tion  purposes) 


Initial  Conditions 

Oncology  Unit  (N=37) 

Units 

Source 

[Coi,  C02,  Jo] 

[1,2,3] 

Individuals 

[assumed, assumed, data] 

m 

[0,1] 

Individuals 

assumed 

T(0) 

[2,2] 

Individuals 

assumed 

fi(0) 

[4,4] 

Individuals 

data 

Parameters 

A 

hi  U(t)  +  +  J  (f)) 

Individuals/day 

- 

m 

0.04 

Dimensionless 

Assumed 

p 

0.001 

1/  day 

Assumed 

7 

0.58 

Dimensionless 

data 

a 

0.29 

1/  day 

data 

hi 

0.16 

1/  day 

data 

h2 

0.08 

1/  day 

data 

approximations  in  a  space  spanned  by  piecewise  linear  splines.  An  approximation  to 
the  solution  of  system  (1)  is  obtained  by  calculating  numerically  the  generalized  time 
dependent  Fourier  coefficients  of  the  approximate  solutions  corresponding  to  the  spline 
representation.  With  these  coefficients  we  can  recover  an  approximation  to  the  solution 
of  the  system.  This  approach  provides  a  rigorous  framework  in  which  global  existence  and 
uniqueness  of  the  solution  along  with  convergence  proofs  can  be  given  (see  [5,  6,  8,  14,  28] 
for  proofs). 

5.1  Abstract  evolution  equation  formulation 

We  use  the  notation  of  [8,  14].  Let  x(t)  represent  the  state  of  the  system  (1)  at  time  t., 
that  is 


and  represent  the  function  space  state  of  the  system  due  to  the  delay  as 

xt(r)  =  x(t.  +  r),  —2  <  r  <  0. 

We  define  Z  =  M3  x  L2(- 2,  0;  M3)  as  the  infinite  dimensional  Hilbert  space  with  norm 

,o  \  1/2 

+  J  ^  \<Kr)\2dTj  , 
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where  ( rj ,  <p)  €  Z  and  L 2  is  the  space  of  square  integrable  functions,  and  inner  product 


(Ob  0),  (C,  $))z  =  VTC  +  J  4>(t )T  'ppr )dr 


for  (r?,  0),  ( C,0)  G  Z. 

Then  the  system  (1)  can  be  written  as 

-  =  L(x(t),xt )  +  /(x(t))  +5  for  ti  <t  <  ti+i, 

(i(O),io)  =  ($(0),$)ez,  (10) 

where  t  >  0  and  £  C(— 2, 0;  M3)  is  the  trajectory  history  function  of  the  system  defined 
on  [-2,0]. 

In  the  model  (10),  g  is  the  state  independent  part  of  the  system,  L(x(t),xt)  is  the 
linear  part  of  the  system,  and  f(x(t ))  is  the  nonlinear  part  of  the  system.  If  we  let  <5_2(t) 
be  the  Dirac  delta  ‘density’  (corresponding  to  a  Heaviside  distribution  with  a  unit  jump 
at  t  =  —2),  we  have 


9  = 


mN ni(l  —  e  2m2) 
0 

mN  gie~2f12 


m(n2  -  /xi)  -  H2  m(//2-Mi)  m(/z2  -  Mi) 
@N  pN  -  M2  /3iV(l  -  7) 

0  0  -M2 


+  ?ne  2/i2(M2~Mi) 


i-H 

1 

1 _ 

-1 

1 

1 — 1 

_ 1 

/*( 

0 

0 

0 

/ 

1 

1 

1 

J- 

(/>(r)d_2(r)(ir 


/(»7) 


0  0  0 

— /3bn  +  2r?2  +  (2  -  j)m\  -Pirn  +  (2  -  7)%]  -p{i-^)m 

0  00 


v- 


With  respect  to  the  nonlinear  terms  in  /(17),  a  more  realistic  model  requires  that  these 
terms  be  bounded  in  the  limit  (i.e.,  saturation  should  be  considered  in  the  nonlinear 
terms  so  that  in  the  limit  it  is  at  least  affine  in  x\  or  X2  or  X3).  For  well  posedness 
considerations  the  nonlinear  terms  can  be  replaced  by  a  function  such  as: 


Pi(Xi) 


0  Xi  <  0 
Pxi  0  <  Xi  <  Xi 
Pxi  Xi  <  Xi 
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with  Xi  €  M+  as  finite  upper  bounds  and  i  =  1,2, 3.  Then  f(jj)  can  be  replaced  by 


fiv) 


0 


~  2h(m)  -  (2  -  7)A(%) 
0 


0 

-him)  -  (2  -  i)him) 

o 


o 

-(i  -i)him) 
o 


which  yields 

dx(t ) 
dt 

(x{0),xQ) 


L(x(t),xt)  +  f(x(t))  +  5  for  U<t<  ti+ 1, 
($(0),$)  G  Z. 


(11) 


5.2  Abstract  evolution  equation  implementation 

We  define  a  nonlinear  operator  A  :  P(A)  C  Z  — >•  Z  by 

A(^>(0),  0)  =  ^L(0(O),  (j>)  +  /(<X  0)), 

with  domain  defined  as 

P(A)  =  {(0(0),^)  G  Z\<j>  €  H1(—2,  0;  M3)}. 

where  neither  the  nonlinear  operator  A  nor  the  domain  T>(A)  depends  on  t.  If  we  let 
z(t)  =  (x(t),xt)  G  Z,  then  the  delay  system  (11)  can  be  formulated  as 

z(  0)  =  z0.  (12) 

Define  ZM  to  be  the  approximating  piecewise  linear  spline  [14,  15]  subspace  of  Z .  PM 
as  the  orthogonal  projection  of  Z  onto  ZM ,  and  AM  as  the  approximating  operator  of  A 
given  by  AM  =  PhI  APM  (again,  see  [8,  14]).  Then  the  system  (12)  can  be  approximated 
by  the  finite  dimensional  system 

^M  =  AMzM(t)  +  PM(g,  0) 
zM{  0)  =  PMz0.  (13) 

We  define  a  basis  and  representation  for  ZM .  By  partitioning  the  interval  [—2,0] 
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with  t Y  =  —j(2/M)  for  j  =  0,  we  can  define  a  basis  /3M  by 

PM  =  (PM(0),PM)  where  pM  =  (e^,  ef , e%)  <g>  I3, 
and  an  element  in  can  be  written  as 

M 

ZM  =  pMaM  =  ^(eM(0),eM)aM; 

i= o 

with  cij 1  £  RM  and  the  e^’s  are  piecewise  linear  functions  with  ejf  (tf^)  =  5ij  for 

Define  as  a  matrix  representation  of  AM  restricted  to  the  subspace  ZM  and  let 
wM(t )  and  Km  be  defined  such  that  zM  (t)  =  j3MwM(t)  and  PM(g,  0)  =  j3MK.  Then, 
solving  for  zM  (t)  in  the  finite  dimensional  system  (13)  is  equivalent  to  solving  for  wM(t) 
in  the  system 

^M  =  AMwM(t)  +  KM 

wM(0)=io™,  (14) 

where  =  PM zq.  We  remark  that  based  on  previous  theory-see  [5,  6,  28],  having 

obtained  wM ,  the  product  j3MwM(t )  converges  uniformly  in  t  to  the  solution  of  (12). 

In  order  to  approximate  PM(rj ,  (f>)  for  any  (rj,  (f>)  £  Z,  where  PM(r],  q i>)  is  the  orthog¬ 
onal  projection  of  (?/,  (j))  £  Z  onto  ZM,  assume  PM(rj,(/))  =  j3MuM  where  uM  £  R3 
then 


0  =  0 MuM A)  JM)Z 

which  implies  that 

0MJM)uM  =  0M,(vA))z.  (15) 

The  orthogonal  projection  PAI  is  uniquely  determined  by  solving  (15)  for  uM  and  implies 
that 

KM  =  (0MJM)z)-10MAg,  0))z 

and 

w, oM  =  (0M 0M)z)~10M ,  (x(0),xq))z- 
Similarly,  in  order  to  approximate  AMaM  for  any  aM  £  R3,  we  observe  that 
AMPMaM  =  PM(Af3MaM)  =  f3MA™aM 

and 

PM(APMaM)  =  PM (L((3M (0)aM , /3M aM)  +  f0M{O)aM),  £0MaM)). 

Thus 
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0  =  pMA™aM  -  PM (L(f3M (0)aM , (3M aM)  +  f0M(O)aM),  -^0MaM)) 


and 

0  =  0MJMA™aM  -  (L((3M  (0)aM ,  (3M  aM)  +  f{/3M(0)aM),  £{PMaM)))z, 
which  implies  that 

0MJM)z(A^aM)  =  0M ,  (L(/3M (0)aM , /3M aM)  +  /(/?M(0)aM),  ^(/3 MaM)))z. 

(1 

Therefore,  A0  aM  is  uniquely  defined  by  solving  (16)  and  implies  that 

A0wM(t)  =  ( 0M0M)z)~1QMwM(t ) 

where 

qm  =  (f(0)]ij(f(0))f)  +  /(f(0)))  +  0M0M). 


5.3  Convergence  of  solutions 

Before  performing  an  inverse  problem  using  the  discrete  event  model  with  delay  (1)  one 
needs  to  know  how  many  partitions  M  to  take  in  the  interval  [-2,  0]  so  that  the  solutions 
of  (13)  are  close  to  the  solutions  of  (12)-  we  are  guaranteed  convergence  as  M  — >  oo. 
In  order  to  do  this,  we  carried  out  simulations  for  the  forward  problem  for  increasing 
values  of  M  with  the  parameter  values  presented  in  Table  2. 

We  used  Matlab’s  ode 45  to  solve  system  (14)  with  M  =  12,  24, 48, 96.  In  Figure  2  we 
present  the  solutions  for  each  M  value.  Note  that  solutions  corresponding  to  the  M  = 
24, 48,  96  values  are  very  close.  However,  these  results  coupled  with  the  computational 
times  required  in  solving  the  system  on  the  time  interval  t  =  [0, 100]  (see  Table  3)  suggest 
that  any  M  between  24  and  48  appears  to  be  a  reasonable  choice  for  this  system.  We 
remark  that  the  computational  time  increases  by  a  factor  of  12  when  we  increase  the 
partitions  from  M  =  48  to  M  =  96.  As  a  result  of  these  investigations,  we  chose  M  =  30 
for  our  continuing  considerations. 
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Table  3:  Computational  times  required  to  solve  the  discrete  event  model  with  delay  (1) 
for  increasing  number  M  of  partitions  using  oncology  unit  parameters. 


M  (partitions) 

Time  (s) 

12 

1.99 

24 

18.17 

48 

194.44 

96 

2,402.10 

Figure  2:  Solutions  of  the  discrete  event  model  (1)  with  delay  for  increasing  M  using 
the  oncology  unit  parameters 


6  Inverse  Problem 

6.1  Least  squares  theory 

We  next  consider  a  least  square  formulation  of  a  generic  inverse  problem  for  a  vector  9 
dependent  system 


x0,  (17) 

with  parameter  vector  9  €  Mp,  x(t)  =  (xi(t), ...,  £jv(t))r  €  M.N ,  and  an  observational 
process  y(tj )  =  Cx{tj\9 )  €  Mm  for  j  =  1  where  C  is  an  m  x  N  matrix.  The 


dx(t ) 
dtt 

x(to)  = 


15 


mathematical  model  is  assumed  to  be  well-posed  (i.e.,  existence  of  a  unique  solution 
that  depends  smoothly  on  the  parameters  and  initial  data). 

Let  Yj,  for  j  =  be  longitudinal  data  observations  corresponding  to  the  ex¬ 

perimental  data  of  the  observational  process.  Since  in  general  Yj  is  not  assumed  to  be 
free  of  error  (i.e.,  error  in  the  data  collection  process),  Yj  will  not  be  y(tj).  We  can 
thus  envision  experimental  data  as  generally  consisting  of  observations  from  a  “perfect 
model”  plus  an  error  component  represented  by  the  statistical  model 

Yj  =  f  (tj ;  90)  +  Sj  for  j  =  1, ..,  n,  (18) 

where  9q  corresponds  to  the  “true”  parameter  that  would  generate  error  free  observations 
{Yj}.  The  function  f(tj,8)  corresponds  to  the  observation  process  for  model  (17)  and 
depends  on  the  parameters  9  in  a  nonlinear  fashion. 

The  £j' s  are  random  variables  and  we  make  the  following  standard  assumptions  on 
our  statistical  model:  (i)  The  measurement  errors  £j  for  j  =  1,  ..,n  have  mean  zero,  i.e., 
E(£j)  =  0;  (ii)  The  measurement  errors  £j  for  j  =  l,..,n  have  the  same  variance,  i.e., 
var(£j)  =  0q  <  oo,  and  are  independent  identically  distributed  (iid)  random  variables 
for  all  tj. 

If  the  error  distribution  is  unknown,  an  ordinary  least  squares  (OLS)  optimization 
procedure  is  often  employed  [9].  This  method  yields  the  estimator 

n 

Ools  =  Ools  =  arg  min0€©  “  /(*i>  e)\2-  (19) 

3= 1 

This  corresponds  to  solving  for  9  in 

n 

J2[Yj-f(tj,9)}Vf(tj,9)  =  0. 

3= 1 

It  is  of  interest  to  know  the  distribution  of  9ols  and  for  this  we  have  the  asymptotic 
theory  (for  details  see  [9,  24,  33])  which  yields  that  as  n  —>  oo 

Ools  =  Ools  ~  NP(9o,X %),  (20) 

where  the  covariance  matrix  £[]  is  dehned  by 

=  a20[nn  q]-1 


and 

Q0  =  lim„^oo-Xn(^o)TXn(^o)- 
n 

Here  yn(0)  =  {Xjk}  is  the  sensitivity  matrix  defined  by 


Xjk{9) 


df(tj,9 ) 

d9k 


j  =  l,...,n  and  k  =  l,...,p. 
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Given  Ools ,  the  estimate  corresponding  to  a  specific  realization  {yj}  of  {Yj},  the  error 
variance  CTq  is  approximated  by 


aOLS  ~ 


1  n 


n  —  p 


(21) 


l=i 

2 


which  is  the  bias  adjusted  estimate  for  erg.  The  covariance  matrix  Eg  is  approximated 

by 

^ols  =  ^ols[xT0ols)x0ols)]  1-  (22) 

Therefore  as  n  — >•  oo 


@ols  ~  NpiOo,  Eg)  «  Mp{6ols ,  S ols )• 


(23) 


If  the  error  distribution  is  unknown  and  the  assumption  of  constant  variance  of  the 
error  in  the  longitudinal  data  does  not  hold,  we  need  to  formulate  a  new  statistical  model 
to  take  into  consideration  the  non-constant  error  variability.  In  this  case  a  generalized 
least  square  (GLS)  optimization  procedure  may  be  more  appropriate.  If  we  can  assume 
that  the  size  of  the  error  depends  on  the  size  of  the  observed  quantity,  the  statistical 
model  (i.e,  a  relative  error  model)  is  given  by 


Yj  =  f(tj,e0)(l  +  £j)  for  j  =  1, ..,  n,  (24) 

where  under  assumptions  (i)-(ii)  we  have  Yj  Ar(f(tj,0o),<Jof2(tj,do)).  In  this  case, 
GLS  can  be  viewed  as  minimizing  the  distance  between  the  data  and  the  model  while 
taking  into  account  unequal  quality  of  the  observations.  The  GLS  method  defines  the 
estimator  9  gls  as  the  solution  of  the  normal  equations 

n 

Yj-\tj,eGLs)[Y3  -  f(tj,0GLs)]Vf(tj,9GLs)  =  0.  (25) 

1=1 

For  motivation  underlying  this  definition  see  [9,  24],  The  idea  is  to  assign  to  each  model 
dependent  observation  a  weight  that  reflects  the  uncertainty  in  that  observation.  From 
the  corresponding  asymptotic  theory  we  have  as  n  — >•  oo 


where 


with 


&GLS  =  9 gls  ~  ^(*0,  so) 


Sq  ~  &o[FT (0q)W (Oq)F(9o)}~1 


m 


df(t  i,fl) 
80! 


9f(tn,e) 
ae  i 


df(ti,0)  j 

aep 


8f(tn,0) 

aep 


(26) 
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and  W  l(0)  =  diag(f2(ti,8), ...,  (f2(tn,0)).  For  the  estimates  we  have  the  covariance 
matrix  approximation  ( Ogls  is  the  estimate  corresponding  to  a  realization  of  {Y.j) 

So  «  s; nGLS  =  ^gls[Ft(0gls)W(8gls)F(8gls)}-\  (27) 

and  the  error  variance  approximation 


aGLS  ~ 


n  — 


-E — - — 

P  f2  (tj  j  &GLS  ) 


I Yj  ~  f(tj,§GLs)Y 


Therefore  we  approximate  in  the  asymptotic  theory  by 

Ogls  ~  Np{0\ o,  Sq)  «  Np(6gls ,  £gls)- 


(28) 


(29) 


We  can  also  calculate  the  standard  errors  for  9gls  by  taking  the  square  roots  of  the 
diagonal  elements  of  the  covariance  matrix  Tj7qLS. 


6.2  Inverse  problem  results 

We  present  results  of  estimating  0  =  (f3)  and  0  =  using  the  model  (1).  We 

demonstrate  the  numerical  and  statistical  capabilities  of  our  model  and  associated  inverse 
problem  algorithms  with  simulated  “data”  in  Section  8. 

As  we  mentioned  before,  the  VRE  surveillance  data  does  not  report  the  number 
of  patients  in  isolation  during  weekends  and  holidays.  Not  surprising,  when  trying  to 
fit  the  model  (1)  to  the  data,  we  found  that  there  were  in  many  periods  additional 
observations  missing  that  did  not  correspond  to  weekends  and  holidays.  However,  we 
identified  a  period  of  about  three  months  (January  17,  2006  to  April  13,  2006)  that  only 
had  weekends  as  missing  values  and  used  these  to  carry  out  the  inverse  problems.  Since 
our  model  is  continuous  through  weekends,  in  order  to  carry  out  the  fitting  we  omitted 
any  weekend  observations  from  the  cost  functional. 

Residuals  (see  [9]  for  a  discussion  of  the  use  of  residual  plots  in  post  inverse  problem 
analysis)  and  the  best  fit-to-data  as  a  result  of  estimating  0  =  (/3)  are  depicted  in  Figure 
3  and  Figure  4.  Results  from  estimating  0  =  (rn.  f3)  are  given  in  Figure  5  and  Figure 
6.  Overall,  from  the  residual  analysis  we  can  conclude  that  whether  we  use  OLS  or 
GLS,  the  errors  are  independent  (residuals  vs  time  are  random).  On  the  other  hand,  the 
residuals  versus  model  plot  for  OLS  compared  to  the  residuals/model  versus  model  for 
GLS  have  the  same  pattern  indicating  no  difference  in  whether  we  use  an  absolute  error 
model  or  a  relative  error  model.  Given  the  wide  variability  in  the  VRE  data,  we  are  led 
to  suspect  that  the  apparent  discrepancy  in  the  statistical  model  may  be  due  to  model 
error  rather  than  measurement  error. 
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(a)  OLS:  Residual  vs  Time 


(c)  GLS:  Residual/Model  vs  Time 


(b)  OLS  Residual  vs  Model 


(d)  GLS:  Residual/Model  vs  Model 


Figure  3:  Residual  analysis  for  the  OLS  and  GLS  optimization  with  model  (1)  as  a  result 
of  estimating  6  =  (j3)  using  oncology  unit  surveillance  data. 


Figure  4:  Best  fit  model  solutions  (model  (1))  to  oncology  unit  surveillance  data  via 
OLS  optimization,  /3  =  0.0039.  At  each  jump,  data  is  fit  using  the  model  value  after 
isolation. 


19 


Figure  5:  Residual  analysis  for  the  OLS  and  GLS  optimization  as  a  result  of  estimating 
0  =  (m,  (5)  using  oncology  unit  surveillance  data. 


Figure  6:  Best  fit  model  solutions  (model  (1))  to  oncology  unit  surveillance  data  via 
OLS  optimization,  (m,  f3)  =  (0.1332,0.008).  At  each  jump,  data  is  fit  using  the  model 
value  after  isolation. 
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7  Model  refinement 


The  inverse  problem  results  from  the  previous  section  using  the  model  (1)  suggests  that 
the  model  fit  does  not  agree  particularly  well  with  the  underlying  process.  Therefore,  we 
refined  the  model,  allowing  more  realistic  details  in  the  model,  and  compared  it  again 
to  the  data  to  investigate  if  there  were  possible  improvements  in  the  model  fit. 

Details  that  we  included  in  the  model  are  based  on  a  faithful  description  of  hospital 
routine  for  patients  that  are  VRE  colonized  on  admission.  The  epidemiology  laboratory 
is  closed  on  weekends  and  this  results  in  two  consequences.  First,  pending  test-results 
from  patients  admitted  on  a  Thursday  or  a  Friday  will  be  back  by  Monday  or  Tuesday. 
Second,  swab-tests  taken  on  patients  admitted  on  a  Saturday  and  a  Sunday  will  be 
sent  to  the  epidemiology  laboratory  on  Monday,  then  these  patients  will  be  isolated  by 
Wednesday.  For  simplicity,  we  focused  on  the  first  detail  to  see  if  accurate  rendering  of 
these  details  in  the  model  provides  any  improvement  in  the  model  fit. 

We  assume  that  for  patients  that  are  admitted  on  Thursdays  and  Fridays,  the  test- 
results  are  back  on  Tuesdays.  These  patients  will  be  moved  into  isolation  by  Tuesday 
night.  As  a  result,  we  also  have  a  jump  discontinuity  every  Tuesday.  If  we  let  tj  be  a 
Tuesday  and  tj+ 1  be  the  next  Tuesday  then 

Ci(if+1)  =  Cj (ij+1)  -  Cj(ii+1  -  4)e-4^  (30) 

represents  the  number  of  patients  in  Cj  after  isolation  on  Tuesday.  This  is  basically  the 
number  of  patients  in  compartment  Cj  before  isolation  minus  the  number  of  patients 
that  were  isolated  at  time  tj+ Hence,  Cj  (tj+i  —  4)  represents  the  number  of  patients 
in  Cj  on  a  Friday  (this  includes  the  ones  that  where  admitted  on  a  Thursday  and  have 
not  been  discharged  by  Friday)  and  e-4/i2  is  the  fraction  of  those  Cj  (j/+i  —  4)  that  were 
not  discharged  over  a  period  of  4  days  before  isolation  (this  follows  the  same  derivation 
as  in  (5.4)).  Consequently,  this  jump  discontinuity  in  Cj  influences  a  positive  jump  in 
J  defined  as 


J(%+1)  =  +  Cj(f,+1  -  4)e-4^,  (31) 

where  J(tj+1)  represent  the  total  number  of  patients  in  isolation  after  that  the  isolation 
on  Tuesday  takes  place.  The  dynamics  for  compartment  Cj  at  time  t  are  modeled  by 

m{niN  +  {n2  -  £ii)[Cj(i)  +  C2{t)  +  J(t)]} 

+  (M2  -  //i)[Cj(t  -  2)  +  C2(t  -  2)  +  J(t  -  2)]} 

M2Cj(fi  for  tj  <  t  <  tSau  (32) 

m{mN  +  (n2  -  m)[Ci(t)  +  C2(t)  +  J(t)]} 

I~l2C\ (f)  for  tF  <t  <  tj+ 1,  (33) 

with  tp  as  day  Friday  and  tsat  as  day  Saturday.  Equation  (32)  models  the  rate  of 


rfCj(t) 

dt 


dC-iit) 

dt 
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change  of  Ci’s  from  Wednesday  through  Friday  in  which  isolation  takes  place  by  the 
corresponding  rate.  On  the  other  hand,  Equation  (33)  models  the  rate  of  change  of  Ci’s 
from  Saturday  through  Tuesday  in  which  isolation  is  not  employed.  Tuesday  is  included 
on  the  interval  because  it  corresponds  to  the  number  of  Ci’s  on  Tuesday  before  isolation 
takes  place.  Finally,  the  complete  model  that  includes  the  new  detail  is  given  by 

=  m{iiiN  +  (n2  -  ii\)[C\(t)  +  C2(t )  +  J(t)]} 

-  me~2112 {niN  +  {n2  -  m)[Ci(t  -  2)  +  C2(t  -  2)  +  J{t  -  2)]} 

-  H2C\{t)  for  tj  <  t  <  tSat, 

=  m{ii\N  +  {n2  -  +  C2(i)  +  J(t)]} 

-  /J2C1  (t)  for  tF  <  t  <  tj+ 1, 

=  C'1(t7+1)-C'1(fJ-+1-4)e-4^, 

=  /3{N  -  [Ci(t)  +  C2(t)  +  J(t)]}[Ci(t)  +  C2(f)  +  (1  -  7)  J(t)] 

-  H2C2(t),  for  ti  <t  <  ti+ !, 

=  C2(t-+1)~C2(ti+1-2)e-2^, 

=  me~2^2 {niN  +  (H2  -  -  2)  +  C2(t  -  2)  +  J(t  -  2)]} 

-  ^2  J(t),  for  tj+i  <  t  <  tSat , 

=  — h2  J(t),  for  tF  <t  <  tj+ 1, 

=  ^7+1)  +  C'1(ii+1-4)e-4^, 

=  J(tr+1)  +  C2(ti+  i-2)e-2^.  (34) 

with  tF  as  day  Friday,  tsat  as  day  Saturday,  tj  as  isolation-day  on  Tuesday,  t,l  as  isolation- 
day  on  Thursday.  Initial  conditions  are:  Ci(0)  =  C01,  C2(0)  =  Co2,  J( 0)  =  Jo,  and  a 
trajectory  of  the  solution  in  the  past:  C\{9)  =  T(0),  C2(0)  =  T(0),  J(6)  =  17(0)  for 
9  G  [-2,0). 

7.1  Inverse  problem  results 

In  this  section  we  present  the  results  of  estimating  8  =  (/3)  using  the  refined  model  (34) 
with  the  surveillance  data.  We  also  verify  the  numerical  accuracy  of  the  computations 
for  the  refined  model.  Section  8  contains  details  on  numerical  validation  as  well  as  model 
comparison  details. 

Figure  7  depicts  the  residual  analysis  and  Figure  8  depicts  the  best  fit  to  data  for 
model  (34)  .  The  results  indicate  that  there  is  not  significant  improvement  in  the  model 
fit  to  the  data.  That  is,  the  additional  detail  in  the  model  corresponding  to  jumps  on 
Tuesdays  does  not  have  a  significant  effect  in  the  fit  to  the  data. 


rfCi(f) 

dt 


dC^t) 

dt 

C'i(tJn) 

dC2(t) 

dt 

C2(tf+ 1) 
dJ(t ) 
dt 

dJ(t ) 
dt 

A4+ 1) 
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Figure  7:  Refined  model:  Residual  analysis  for  the  OLS  and  GLS  optimization  with 
model  (34)  as  a  result  of  estimating  6  =  (f3)  using  oncology  unit  surveillance  data. 


Figure  8:  Refined  model:  Best  fit  model  solutions  to  oncology  unit  surveillance  data  via 
OLS  optimization  with  model  (34),  {3  =  0.0039.  Note  jumps  every  5  days  corresponding 
to  every  Tuesdays  and  jumps  every  7  days  corresponding  to  Thursdays.  At  each  jump, 
data  is  fit  using  the  model  value  after  isolation. 
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8  Models  comparison  and  algorithm  analysis 


We  next  report  on  comparison  of  model  (1)  with  model  (34)  as  well  as  validate  the 
numerical  accuracy  of  these  models  when  carrying  out  an  inverse  problem. 

In  order  to  compare  both  models,  we  simulated  a  solution  with  the  parameters  values 
recorded  in  Table  4.  Figure  9  contains  a  plot  of  the  forward  solutions  of  the  models  in 
comparison  to  the  data.  The  plots  indicate  that  relative  to  the  data  the  dynamics  of 
model  (1)  in  comparison  to  model  (34)  are  essentially  the  same.  In  the  first  14  days 
we  see  a  little  difference  but  after  that  there  is  no  qualitative  difference  except  for  the 
Tuesdays’  jump  in  model  (34).  We  can  conclude  that  with  the  addition  in  model  (34) 
of  more  details  to  model  (1)  there  appears  to  be  no  significant  improvement  over  model 
(1)  fits  to  the  surveillance  data. 

In  order  to  test  the  computational  efficacy  of  use  of  the  models  in  inverse  problem 
calculations,  we  added  noise  (constant  variance  error)  to  the  forward  solution  and  carried 
out  an  inverse  problem  using  the  noise-added  solution  as  synthetic  data  to  attempt  to 
recover  the  original  parameters.  In  other  words,  realizations  of  synthetic  data  {yj}  for 
j  =  1  ,...,n,  are  constructed  by  adding  variability  to  the  model  solution,  f(tj,8o)  = 
J(tj,9o).  The  statistical  model  that  captures  the  variability  is 


Yj  =  f{tj,80)  +  crZj  (35) 

where  Zj  is  a  standard  normal  variable  (i.e. ,  Zj  ~  AT(0, 1))  and  a2  is  the  constant 
variance.  The  magnitude  of  cr  determines  the  amount  of  noise  added. 

We  conducted  the  inverse  problems  to  estimate  6  =  (/3)  and  9  =  ( m ,  f3 )  via  ordinary 
least  squares  (OLS)  with  different  noise  levels:  a  =  0,0.10,0.30,0.50  and  0.70.  In 
Tables  5  and  6  we  summarize  the  results  for  the  estimation  of  9  =  (/3)  and  9  =  (m,  (5) 
corresponding  to  model  (1)  and  model  (34).  Fitting  results  are  presented  in  Figures  10 
and  11  for  model  (1),  and  Figures  12  and  13  for  model  (34).  We  can  conclude  that  the 
estimation  procedure  performed  adequately  with  each  model  when  using  noisy  synthetic 
data. 
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Table  4:  Parameters  values  used  in  the  forward  type  problem  to  compare  models 


Initial  Conditions 

Units 

Oncology  Unit  Values 
(N=37) 

z(0) 

Individuals 

[2;4;11] 

x0 

Individuals 

[4,2;5,3;9,10] 

Parameters 

A 

Individuals/day 

0.16C/(t)  +  0.08(C(t)  +  J(f)) 

m 

Dimensionless 

0.4 

p 

1/ day 

0.0039 

7 

Dimensionless 

0.58 

a 

1/ day 

0.29 

hi 

1/ day 

0.16 

h2 

1/ day 

0.08 

Table  5:  OLS  optimization  testing  using  Model  (1)  for  parameter  subsets  0  =  (/3)  and 
6  =  (m,  j3).  The  model  was  fit  to  generated  data  with  a  =  0, 0.1, 0.3,  0.5, 0.7. 


a 

0 

0.1 

0.3 

0.5 

0.7 

m 

- 

- 

- 

- 

- 

p 

0.0039 

0.0039 

0.0040 

0.0041 

0.0041 

m 

0.04 

0.0392 

0.0386 

0.0364 

0.0334 

p 

0.0039 

0.0040 

0.0040 

0.0042 

0.0043 

Table  6:  OLS  optimization  testing  using  Model  (34)  for  parameter  subsets  0  =  (/3)  and 
0  =  ( m .,  j3).  The  model  was  fit  to  generate  data  with  cr  =  0,  0.1, 0.3,  0.5, 0.7. 


<T 

0 

0.1 

0.3 

0.5 

0.7 

m 

- 

- 

- 

- 

- 

p 

0.0039 

0.0039 

0.0040 

0.0040 

0.0041 

m 

0.04 

0.0404 

0.0407 

0.0424 

0.0449 

p 

0.0039 

0.0039 

0.0040 

0.0040 

0.0040 
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(a)  Model  (1)  solution  using  forward  type  problem  in  comparison  to  the 
data.  Note  jumps  every  7  days  corresponding  to  Thursdays.  Both  values 
at  the  jump  are  plotted. 


(b)  Model  (34)  solution  using  forward  type  problem  in  comparison  to  the 
data.  Note  jumps  every  5  days  corresponding  to  Tuesdays  and  jumps 
every  7  days  corresponding  to  Thursdays.  Both  values  at  the  jumps  are 
plotted. 

Figure  9:  Solutions  of  the  models  using  forward  type  problem  in  comparison  to  the 
oncology  unit  surveillance  data. 
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VRE  Patients  in  Isolation  VRE  Patients  in  Isolation 


(c)  Level  of  noise:  er=0.50 


(b)  Level  of  noise:  <7=0.30 


(d)  Level  of  noise:  <7=0.70 


Figure  10:  Model  (1)  fit  to  the  synthetic  data  using  OLS  optimization  procedure  to 
estimate  0  =  ((5). 
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VRE  Patients  in  Isolation  VRE  Patients  in  Isolation 


(a)  Level  of  noise:  <7=0.10  (b)  Level  of  noise:  <7=0.30 


Figure  11:  Model  (1)  fit  to  the  synthetic  data  using  OLS  optimization  procedure  to 
estimate  0  = 
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VRE  Patients  in  Isolation  VRE  Patients  in  Isolation 


(a)  Level  of  noise:  <7=0.10 


(c)  Level  of  noise:  <7=0.50 


(b)  Level  of  noise:  <7=0.30 


Figure  12:  Model  (34)  fit  to  the  synthetic  data  using  OLS  optimization  procedure  to 
estimate  9  =  (j3). 
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VRE  Patients  in  Isolation  VRE  Patients  in  Isolation 


(a)  Level  of  noise:  <7=0.10 


(c)  Level  of  noise:  <7=0.50 

Figure  13:  Model  (34)  fit  to  the  synthetic 
estimate  0  =  (m,  j3) 


(d)  Level  of  noise:  <7=0.70 

using  OLS  optimization  procedure  to 
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9  Conclusions 


We  have  introduced  a  discrete  event  VRE  model  with  delay  that  incorporates  specific 
details  with  respect  to  the  isolation  procedure  employed  to  patients  in  a  hospital  unit.  We 
attempted  to  estimate  some  of  the  epidemiological  parameters  such  as  the  transmission 
rate  (/3)  and  the  fraction  (m)  of  patients  VRE  colonized  on  admission.  Results  suggested 
that  there  was  little  success  in  the  model  fits  to  the  data,  especially  when  trying  to 
estimate  6  =  ( m,/3 ).  We  followed  with  a  model  refinement  that  includes  additional 
detail  to  investigate  whether  an  improvement  in  the  model  fits  to  the  data  could  be 
achieved.  Results  revealed  no  improvement  in  the  model  fit  to  the  data.  To  ensure  that 
our  inverse  problem  calculations  were  not  the  source  of  difficulty,  we  used  simulated  data 
(produced  by  the  model  with  added  noise)  to  validate  our  methodology  and  our  ability 
to  successfully  estimate  parameters  in  the  model.  Thus  we  demonstrated  that  if  we  have 
quality  data  for  which  the  model  is  a  reasonable  approximation,  we  can  successfully 
carry  out  the  inverse  problem,  even  with  very  noisy  data. 

We  conclude  that  the  VRE  surveillance  data  appears  to  be  very  irregular  as  it  does 
not  seem  to  follow  any  standard  process  and  contains  an  extemely  high  level  of  variability. 
One  reason  that  the  data  has  little  to  do  with  the  process  could  be  due  to  the  fact  that 
there  is  not  a  true  parameter  9q  in  the  statistical  model,  Yj  =  f(tj,0Q )  +  £j,  that 
can  generate  the  data  we  have  from  this  hospital  unit.  The  assumption  of  existence 
of  such  a  9q  is  an  essential  foundation  of  much  of  the  statistical  and  mathematical 
methodology  for  inverse  problems  (see  [9]).  Another  possible  reason  could  be  that  it 
is  likely  that  the  underlying  assumption  for  statistical  model  should  be  modified  to 
Yj  =  f(tj,6o)  +  g(tj,0o)£j,  where  we  do  not  know  g(tj,9).  Another  (more  likely  we 
believe)  possibility  is  that  there  is  no  underlying  model  f{tj,9o)  that  describes  this 
data.  Rr  other  words,  this  data  is  completely  irregular  due  to  inherent  irregularities 
under  data  reporting.  Thus  we  suspect  that  the  quality  of  the  data  is  not  sufficient 
for  model  development.  R  is  therefore,  in  the  terminology  introduced  earlier,  a  prime 
example  of  “cold  data”  which  is  not  adequate  for  model  development  and  validation. 

The  models  developed  here  involve  a  data  collection  process  as  described  to  us  by 
the  health  workers.  Thus  we  suggest  that  there  are  irregularities  in  reporting  or  in 
their  descriptions  of  what  they  actually  do.  This  claim  is  supported  in  Section  3.5 
where  we  described  inconsistencies  in  the  data  reporting.  Any  future  efforts  on  model 
development  would  require  a  much  more  careful  design  and  implementation  of  the  data 
collection  process. 
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